Using Data on Applicants to Training Programs to Measure the Program’s Effect on Earnings

Author

  • Glen Cain
Abstract

Using data from the AFDC Homemaker-Home Health Aide Demonstration, a welfare training program which had an experimental design, a method of evaluation is developed for possible application in a nonexperimental setting, where random assignment is not available. Two techniques to control for selection bias are pursued. One uses applicants as the source of one or more comparison groups, which effectively controls for the self-selection process of volunteering. The other uses the subjective ranking of suitability of applicants by program administrators; although crudely measured, this variable takes into account administrative selection procedures. The subjective ranking is added to the more conventional list of independent variables. Two nonexperimental applicant groups, those who were screened out and those who dropped out, appear sufficiently similar to the control group in predicted earnings in the two postprogram years to indicate that the procedure has promise. IRS data are used to measure earnings in the years that preceded, overlapped, and followed the training program.

I. DEALING WITH SELECTION BIAS IN NONEXPERIMENTAL EVALUATIONS

The main purpose of this paper is to propose and examine a method of evaluation research in a nonexperimental setting, in which the advantage of random assignment to treatment and control groups is not available. Another purpose is to use earnings data from the Internal Revenue Service to measure the program’s impact over a longer postprogram period than is commonly available with survey data. The IRS data, in conjunction with data from a previous evaluation of the training program, permit measures of the impact of the program using nonexperimental comparison groups as alternatives to a randomly assigned control group.
In nonexperimental evaluations it is widely recognized that persons in comparison groups are likely to differ from program participants in unobserved ways that influence earnings, as a result of the process used to select participants in the program. Although labor economists have applied increasingly sophisticated econometric methods to deal with this problem of selectivity bias, there remains no agreement on which nonexperimental methods, if any, yield valid evaluations. Our proposed method is to model the selection process to take account of the systematic factors that determine who receives the treatment and who does not. By including the variables that systematically determine selection in the model that estimates the treatment effect on the outcome, all excluded variables may be assumed to be, on average, independent of the treatment variable. If the assumption holds, the estimate of the treatment effect should be unbiased.

Our application of the method uses data from an experimentally designed evaluation of the AFDC Homemaker-Home Health Aide Demonstration, a training program for women who were AFDC recipients. Information was collected on characteristics of 11,000 program applicants. The women who were selected for the program were trained as homemaker-home health aides to infirm elderly persons in demonstration projects in seven states in the period from 1983 to 1986.

Selection bias has two major sources, self-selection and administrative selection. We suggest below several reasons why an evaluation that uses as a comparison group applicants who did not subsequently participate in the training program may reduce the selection bias relative to a comparison group taken from an external data source, such as the Current Population Survey of the Bureau of the Census. In our study the women who applied for training were, as are most applicants, self-selected.
Thus, those who did not participate in the training program had in common with the trainees this attribute of volunteering. In some cases volunteering may be a response to particular earnings and employment experiences of the applicants. For example, it has been widely documented that the average earnings of trainees tend to show a "pre-program dip" in the survey year before the training program begins. (See Ashenfelter and Card, 1985, and the studies they cite.)

Another similarity among the applicants is their location in the same local labor market. Comparisons between program participants and persons from external data sources are vulnerable to the criticism that particular circumstances of the time and place of the program’s setting and the particular personal characteristics of the participants will cause the participants to have quite different postprogram labor market experiences from those of a comparison group chosen from an external source. The information available in survey sources is unlikely to permit matching on these particularities.

Selection from the applicant group into the program, which is usually determined by program administrators, is a second source of selection bias. In our data a subjectively determined ranking of the applicants in terms of their "suitability" or "potential" for succeeding in the training program was used by program administrators as a criterion for selecting applicants into the training program. This ranking variable offers an unusual opportunity to model the administrator’s selection process in a way that accounts for the subjective factors that determine selection in addition to the use of various objective measures that are conventionally available in survey data.

Applicants who were screened as eligible for the training program were randomly assigned to a treatment (trainee) group and a control group. About 16 percent of those selected for the treatment group did not attend, a group we label as No Shows.
Those who entered the program are called Participants. The program consisted of four to eight weeks of training and up to a year of subsidized work experience providing home care to elderly patients.

The random selection process for Controls requires that their outcomes (primarily employment and earnings) be compared to the experiences of the combined group of Participants and No Shows. Although the participants in training programs are ultimately the group of interest, there is no accepted way of matching the Controls with the Participants. Expressed differently, it would be necessary to model the selection process that distinguished the No Shows from the Participants to be able to interpret the Controls as a valid comparison group for the Participants. We do not, however, have specific information about the self-selection that determined the decision of the No Shows to pass up the training. Arguments can be made either for negative selection, such as having low motivation, or for positive selection, such as having found a preferred job. As we show below, the mean values of the subjective rankings and of a variety of objective characteristics among the No Shows, Participants, and Controls are similar.

The remaining applicants in the sample fall into three groups: those who were rejected by the screening process (Screen Outs), those who dropped out before the screening process was completed, and a relatively small number whom we termed "not elsewhere classified" (NEC). In this paper we combine the latter two groups and label them Drop Outs. (Since the NEC group was not selected for the experiment and has no record of being screened out, we presume that they dropped out before an administrative decision about their status was made.) Clearly, it is the Screen Outs for whom the subjective ranking of their suitability for the program provides a direct measure, although not a full measure, of the determinants of selection.
In contrast, we have no specific information about why the residual group dropped out, and, like the No Shows, their reasons for dropping out may reflect negative or positive selection factors.

II. THE DATA AND THE REGRESSION MODEL FOR ESTIMATION

The program outcome of interest (dependent variable) for this analysis is annual earnings, obtained from the Internal Revenue Service for the years 1984 to 1988. These years cover most of the period of the training program, which, given the different timings of admission periods across the seven states, ran from the last months of 1983 to the first months of 1986. Importantly, the IRS data for 1987 and 1988 give us two full years of postprogram experience for all trainees.

To protect the confidentiality of the data, the IRS provides mean earnings (and standard deviations) for grouped data consisting of a minimum of 10 persons. The individual records from the training project were linked to the IRS data by social security numbers. For reasons of convenience in data processing, our grouped data, 1,006 units, contain between 10 and 19 persons, and the average group size is 11.1. All groups are homogeneous with respect to the six applicant groups (including those who were "not elsewhere classified"), and most are homogeneous with respect to race (Hispanic-White, Black, and White) and the subjective ranking variable. Four values of the subjective rank of an applicant were used by the program administrators: 1 = most suitable for acceptance, 2 = suitable, 3 = less suitable, and 4 = least suitable. Homogeneity of a characteristic within a group maximizes the variance in the mean values of that characteristic across all 1,006 groups, which will increase the reliability of the estimated effect (coefficient) of that characteristic on the earnings outcome.

The IRS earnings are the only earnings information we have for the two nonexperimental groups, Screen Outs and Drop Outs.
For the experimental groups we determined that the IRS earnings and the earnings from the survey carried out for the evaluation of the experiment are quite similar for periods of overlap. For 1985, the most complete year for an overlap between the two data sources, the mean survey earnings exceeded the mean IRS earnings by 2 percent. A notable finding in our study is that the IRS is a low-cost source of earnings data for long periods of postprogram experience. Inadequate lengths of posttraining experiences have been a serious weakness in both experimental and nonexperimental evaluation studies.

The estimation model we use for evaluation has the following general form:

    EARNS = A + XB + ZC + DS + u,

where EARNS is the mean earnings of the grouped data for the applicants; X is a vector of "objective" variables that represent productivity or earnings traits (and possible selection traits); Z is a vector of five applicant categories defined as dummy variables: Participants, Controls, No Shows, Screen Outs, and Drop Outs, all defined above; S is the subjective ranking of the applicant; A, B, C, and D are parameters to be estimated; and u is an assumed well-behaved error term. Note that the variables EARNS, X, Z, and S are all mean values for the data groups of 10 to 19 applicants. (The Z values are, however, all equal to 1 or 0.)

Table 1 shows the sample sizes for the five applicant groups in terms of the number of persons and in terms of the number of data groups that give the mean values for the IRS-reported earnings and other variables. The full sample for estimating the EARNS model consists of the 1,006 observations for the data groups. Two features of the nonexperimental data limit the effective size of the sample for testing comparison groups for a nonexperimental evaluation. First, there are only 61 No-Show groups, 84 Screen-Out groups, and 82 Drop-Out groups, compared to 420 Control groups and 359 Participant groups. Second, all data for the Screen Outs and Drop Outs come from the last of the three intake periods for the program, July 1984 to May 1985. (This group of applicants will be referred to as Cohort 3.) None are from the 1983 (Cohort 1) or January-June 1984 (Cohort 2) periods, which yielded cohorts of applicants who had considerably higher earnings than those in Cohort 3. Because the period of application is another characteristic of the selection process, we will confine most of our analysis to Cohort 3.

TABLE 1
Sizes of Applicant Groups

                                 Number of Data Groups
                   Number of     Cohort 1   Cohort 2              Cohort 3             All
Applicant Group    Persons       1983       Jan. 1984-June 1984   July 1984-May 1985   Cohorts
Participant         3,912        124         90                   145                    359
Control             4,625        147        107                   166                    420
No Show               725         21         16                    24                     61
Screen Out            931         --         --                    84                     84
Drop Out              909         --         --                    82                     82
Total              11,102        292        213                   501                  1,006

Note: Of the 82 dropout groups, 63 were originally classified as "drop-outs" and 19 as "not elsewhere classified (NEC)".

Table 2 shows selected descriptive statistics of the applicant groups, based on the 1,006 observations of data groups. The Participants and Controls are similar in education, race, and subjective rank scores. The Controls would be shown to be even more similar to the combined group of Participants and No Shows, because the screened-in applicants were randomly assigned to either the combined groups or to the Control group. The Screen Outs and the Drop Outs have the lowest average educational attainment and the lowest average subjective rank, recognizing that rank 1 is most suitable. (Mean values of all the measured characteristics of the applicant groups are available from the authors.)

The results from estimating the EARNS model are addressed to three main questions. First, which among the three categories--No Shows, Screen Outs, and Drop Outs--works best as an alternative to the experimental control group in estimating the earnings impact of the training program?
In simplest terms, does the postprogram earnings record of any of the alternative groups so closely match that of the Controls that it shows promise as a comparison group in a nonexperimental design? Second, to what extent does the addition of the subjective ranking variable, S, improve the comparison with the Controls of these alternative groups? Third, what new information about the earnings differentials between the trainees and the other groups, including the Controls, is provided by the IRS data covering the extended postprogram period?

TABLE 2
Selected Characteristics of Applicant Groups

Cohort 3
                   % Education                        Average
Applicant Group    < 12          % White   % Black    Subjective Rank
Participant        44            28        56         2.3
Control            46            28        55         2.3
No Show            50            17        58         2.3
Screen Out         50            38        51         3.1
Drop Out           52            30        53         3.0

All Cohorts
                   % Education                        Average           % Subjective      % Subjective
Applicant Group    < 12          % White   % Black    Subjective Rank   Rank Unreported   Rank = 1, 2
Participant        44            26        60         2.2                8                80
Control            45            25        59         2.2                9                78
No Show            47            17        54         2.3               10                77
Screen Out         50            38        51         3.1               36                31
Drop Out           52            30        53         3.0               63                22

Notes:
% Education < 12: Percent with less than a high school degree.
Average Subjective Rank: Average score of subjective ranking by administrators for suitability for program: 1 is most suitable; 5 is least; 3 is the number assigned to unreported rankings.
% Subjective Rank Unreported: Percent with subjective rank unreported.
% Subjective Rank = 1, 2: Percent with highest rankings (= 1, 2).
Drop Out: Of the 82 dropout groups, 63 were originally classified as "drop-outs" and 19 as "not elsewhere classified (NEC)".

Note that S, the subjective rank, may be presumed to have been a key determinant of selection into the Screen Out group, but S does not directly determine selection into the No Shows or Drop Outs. Again, we do not have explicit information on the selection process for these two groups. Although our empirical results reported below show that the postprogram earnings of the Controls and the No Shows are similar, we do not have a strong a priori theoretical justification for these results, which may merely reflect this particular program or sample.
By contrast, similarity in postprogram earnings between the Controls and Screen Outs, in conjunction with success in using the S variable, has greater practical significance because of its strong theoretical rationalization; specifically, that modeling the selection process is a valid and practical method for obtaining unbiased estimates of treatment effects in a nonexperimental design.

The model above is estimated for each of five years, 1984 to 1988. Only the last two years, 1987 and 1988, are, however, strict tests of the Participants’ earnings performance in the posttraining, unsubsidized, period. As will be shown below, earnings of the trainees are substantially higher in the early years, but these results are not necessarily a measure of enhanced earnings capacities in an unsubsidized labor market. We also show that these early years are indeed a period of transitorily low earnings for the groups that did not enter the training program.

The modest sample size of data groups for No Shows, Screen Outs, and Drop Outs is one limitation of our estimations. Another is the weaknesses in a key variable, S, measuring the subjective rank of an applicant. First, the metric for S is limited to only four values, 1 to 4. Clearly, a wider variation would be desirable. Note that a two-value rating designating acceptance or rejection would be useless if it were strictly adhered to. Second, the rating is not recorded for many applicants. In our data a missing rating is scored 3, after setting the old ratings of 3 and 4 to equal 4 and 5. A score of 3 effectively assigns a middle (or "neutral") value to a missing observation. The S variable is thus measured with error. To the extent that the error is random, the explanatory power of S and the estimate of its true coefficient in the EARNS model are biased downward. As shown in Table 2, the problem of missing values is much worse among the categories for nonexperimental comparison groups.
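The recoding of S described above can be illustrated with a minimal sketch. The function names and the example group are hypothetical and are not taken from the study's data processing; the only rules assumed are the ones stated in the text (original ranks 1 and 2 unchanged, 3 and 4 shifted to 4 and 5, missing ranks scored 3).

```python
from typing import Iterable, Optional

MISSING_SCORE = 3  # neutral value the text assigns to unreported rankings


def recode_rank(raw: Optional[int]) -> int:
    """Map an administrator's original 1-4 rank onto the 1-5 analysis scale."""
    if raw is None:
        return MISSING_SCORE          # unreported rank gets the middle value
    if raw not in (1, 2, 3, 4):
        raise ValueError(f"unexpected rank: {raw!r}")
    # Ranks 1 and 2 are kept; ranks 3 and 4 are shifted up to 4 and 5.
    return raw if raw <= 2 else raw + 1


def group_mean_rank(raw_ranks: Iterable[Optional[int]]) -> float:
    """Mean recoded rank for one data group (the unit of observation)."""
    scores = [recode_rank(r) for r in raw_ranks]
    return sum(scores) / len(scores)


# Hypothetical data group of 10 applicants, two with unreported ranks.
example_group = [1, 2, None, 4, 2, None, 3, 1, 2, 2]
print(group_mean_rank(example_group))
```

Because every missing rank is pulled to the same middle value, a group with many unreported ranks will have a mean S near 3 whatever its members' actual suitability, which is the measurement-error concern raised in the text.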
In the three categories that were screened in--Participants, Controls, and No Shows--less than 10 percent of persons have missing values, while the Screen Outs have missing values in 36 percent of the cases, and 63 percent of the Drop Outs have missing values for S.

A final shortcoming of the S variable, which is not unexpected, is the concentration of low values (designating suitability) for Participants, Controls, and No Shows, and the concentration of relatively high values of S (not suitable) for Screen Outs and Drop Outs. About 79 percent of the group units for Participants, Controls, and No Shows have a mean value of 1 or 2, but only 27 percent of Screen Outs and Drop Outs have these low values.

It is clear that these data, useful as they are for providing some information about modeling selection, hardly give the method of subjective evaluation a fair test of its role in modeling selection. Since the original evaluation design for this program was that of a controlled experiment, it is not surprising that obtaining S ratings for the Screen Outs was not emphasized. The scarcity of ratings for the Drop Outs reflects the fact that limited contact was obtained with this group.

III. ESTIMATION RESULTS THAT TEST FOR ALTERNATIVE COMPARISON GROUPS: TABLES 3 TO 7

Tables 3-7 contain the main results for examining our three objectives: (1) using applicants to obtain a valid comparison group in nonexperimental evaluations of training programs; (2) modeling the program administrators’ selection of program participants from among the pool of applicants; specifically, using the administrators’ subjective rankings to eliminate selection bias in estimating the effect of the program; and (3) using IRS data on earnings to measure the long-term impact of the program.
We argue below that some degree of success is obtained in each of the three objectives, and that there is useful information in the results even if one were pessimistic about using applicant groups in a nonexperimental evaluation.

Tables 3-7 show selected regression models that estimate IRS-recorded annual earnings for each of the five years, 1984 to 1988. The sample for these regressions consists of only Cohort 3, those who applied for the program between July 1984 and May 1985. As noted above, this is the only cohort period with data for Screen Outs and Drop Outs. Thus, 1984 is a year before the training program began for most of the sample. In 1985 some portion of time was spent in training and in both subsidized and unsubsidized employment for some of the Participants. In 1986 most of the trainees were in the unsubsidized labor market for most of the year, although some who entered the program as late as May 1985 were in subsidized jobs in the first part of 1986. The years 1987 and 1988 were a period of unsubsidized employment for all the trainees.

The outcomes for 1987 and 1988 provide the sharpest focus on the three objectives posed above, but several comments about the results for all five years are useful. The annual earnings of these women are very low indeed, especially in the early years. In 1984, when all the women in the sample were recipients of AFDC for at least part of the year, the average earnings reported to the IRS were only $776, and the average for the Control group was $749. In the next four years earnings rose sharply for every applicant group. Among the Controls average earnings rose to $1,495 in 1985, $2,473 in 1986, $3,139 in 1987, and $3,896 in 1988. The figures for No Shows, Screen Outs and Drop Outs are similar. Both the distinction between transitory and permanent income levels and the impermanence of AFDC status are evident.
The overall low level of earnings and the trend in earnings reveal the sharply depressed economic situation of these women at or near the time of applying for the program. Matching the trainee group from sources other than the applicants would be difficult.

The average earnings reported for 1984 in Table 3 are considerably higher for Participants than for Controls, $898 compared to $749. Even the weighted average of the 1984 earnings for Participants and No Shows, with No Shows as 15.5 percent of the combined group, is $869, which is $120 (or 16 percent) higher than the mean for the Controls. Random assignment should have made the preprogram earnings for Controls and the combined Participants plus No Shows similar. The reason for the difference is that some of the Participants benefited from subsidized employment in the last several months of 1984. Thus, 1984 earnings are not a "pure" preprogram measure of earnings for the Participants.

Column (1) in each table lists the five applicant groups. In the regression models, dummy variables designate the groups, with Controls as the omitted group. Column (2) reports the mean earnings for each of the five applicant groups in the year for each table. Column (3) reports the coefficients of the four dummy variables for the applicant groups in a regression with no other independent variables. This is a baseline regression that gives the same information as column (2), but (3) provides a comparison for how the annual earnings of the five applicant groups change in the regression models shown in columns (4) to (6), when various sets of explanatory variables are included. (No column with 1984 earnings as an independent variable is listed in Table 3, because 1984 earnings form the dependent variable in this table.) By specifying Controls as the omitted group in the regressions, we interpret each coefficient of the other groups as the group’s earnings differential relative to the Controls.
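The two pieces of arithmetic in the preceding paragraphs can be checked with a short sketch using the 1984 figures reported in Table 3. The variable and function names are illustrative only. The sketch relies on a standard property of a regression containing only group dummies: each dummy coefficient equals that group's mean minus the mean of the omitted group.

```python
# 1984 mean earnings from Table 3, column (2), in dollars.
control_mean = 749
group_means = {
    "Participants": 898,
    "No Shows": 710,
    "Screen Outs": 592,
    "Dropouts/NEC": 823,
}

# Dummy-variable coefficients as reported in Table 3, column (3).
reported_col3 = {
    "Participants": 149,
    "No Shows": -39,
    "Screen Outs": -157,
    "Dropouts/NEC": 74,
}


def dummy_coefficients(means, omitted_mean):
    """Coefficients of a dummies-only regression: group mean minus omitted-group mean."""
    return {group: mean - omitted_mean for group, mean in means.items()}


coefs = dummy_coefficients(group_means, control_mean)

# Weighted 1984 mean for Participants plus No Shows,
# with No Shows as 15.5 percent of the combined group (from the text).
no_show_share = 0.155
combined_mean = (
    (1 - no_show_share) * group_means["Participants"]
    + no_show_share * group_means["No Shows"]
)
gap = combined_mean - control_mean
gap_pct = 100 * gap / control_mean

print(coefs)
print(round(combined_mean), round(gap), round(gap_pct))
```

The derived coefficients match column (3), and the weighted mean reproduces the $869 figure and the $120 (16 percent) gap cited above.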
Thus, the relatively large negative coefficient for Screen Outs in column (3) in Tables 3-7 (for each year) shows consistent evidence of "creaming" by the program administrators--that is, selecting (relatively) high-earnings women for the chance to enter the training program and rejecting low-earnings women.

TABLE 3
Regression Results for Cohort 3 for 1984, One Year before the Training Program Started for Most Participants
(Dependent variable = total earnings in 1984; mean = $776; standard errors in parentheses)

Columns:
(1) Applicant Group and Other Independent Variables
(2) Mean Earnings
(3) Coefficients of Dummy Variables
(4) Coefficients in Regression with Conventional Independent Variables
(5) Coefficients in Regression with Subjective Ranking

(1)                  (2)       (3)       (4)       (5)
Controls             $749      --        --        --
                     (63)
Participants         898       149       127       126
                               (93)      (92)      (92)
No Shows             710       -39       -29       -29
                               (179)     (177)     (177)
Screen Outs          592       -157      -69       -91
                               (110)     (126)     (133)
Dropouts/NEC         823       74        104       86
                               (110)     (126)     (131)
Subjective Rank                                    24
                                                   (48)
N                    499       499       493       493
R                              .02       .08       .08

Notes:
For Cohort 3, the application and program assignment took place between July 1984 and May 1985.
Column (3): Regression with only applicant group variables.
Column (4): Conventional independent variables: race, state location, education, marital status, age, number of children, worked for pay before, highest wage attained. (See Appendix for full definitions and coefficients.)
Column (5): Includes all independent variables in column (4) plus subjective ranking of administrator.
Controls: Intercept term (and standard error in parentheses) from regression (3), with Controls as omitted dummy variable.

Column (4) shows the coefficients of the dummy variables for the applicant groups in a regression that includes a number of worker characteristics that are typical of those used to estimate earnings with survey data. Their intended role in an evaluation analysis is to account for productivity differences (or determinants of earnings) that are not attributable to the training program.
The list of variables, referred to in the table as "conventional" explanatory variables, includes the race (or ethnicity), age, marital status, and education of the woman, the number of dependent children she has, her state of residence, whether the woman had ever been employed before, and her highest wage obtained in past employment. These eight types of explanatory variables are, in fact, specified with 22 right-hand-side variables. The full list of variables is shown in Appendix Table A.1, along with their means and estimated coefficients in a representative earnings regression from Table 7. A surprising finding in Tables 3-7 is that the coefficients in column (4) for No Shows, Screen Outs, and Drop Outs are not very different (and sometimes smaller in absolute value) from the corresponding coefficients for these groups in column (3). Evidently, the conventional productivity variables have little systematic effect on narrowing the earnings differential between Controls and Screen Outs (or between the Controls and the other two comparison groups), and, therefore, do not appear to account for the selection bias affecting earnings. The coefficients of the Participant group are discussed later, when the findings for the experimental design of the analysis are considered. At this point we note only that the coefficients of the Participant dummy variable do not much change as explanatory variables are added, as indeed they should not, given random assignments. The small changes that do occur may reflect the nonrandom selection (or separation) of No Shows from the Participants. Column (5) in Tables 4-7 shows the coefficients of the dummy variables for the applicant groups in a regression that adds 1984 earnings to the variables used in column (4). Earnings in 1984 are for the period before the training program began for most of the women in Cohort 3. 
Although not always available in evaluation analyses, a pretraining earnings variable is recognized to be an important predictor of a person’s productive capacity.

TABLE 4
Regression Results for Cohort 3 for 1985, a Year of Subsidized Employment for Most Participants
(Dependent variable = total earnings, 1985; mean = $2,240; standard errors in parentheses)

Columns (Tables 4-7):
(1) Applicant Group and Other Independent Variables
(2) Mean Earnings
(3) Coefficients of Dummy Variables
(4) Coefficients in Regression with Conventional Independent Variables
(5) Coefficients in Regression with 1984 Earnings
(6) Coefficients in Regression with Subjective Ranking

(1)                   (2)       (3)       (4)       (5)       (6)
Controls              $1,495    --        --        --        --
                      (89)
Participants          4,273     2,778     2,729     2,608     2,610
                                (131)     (129)     (95)      (95)
No Shows              1,484     -11       139       167       168
                                (251)     (248)     (182)     (182)
Screen Outs           1,154     -341      -399      -333      -263
                                (155)     (177)     (130)     (137)
Dropouts/NEC          1,477     -18       -135      -234      -177
                                (155)     (177)     (130)     (135)
1984 Total Earnings                                 .95       .96
                                                    (.05)     (.05)
Subjective Rank                                               -78
                                                              (49)
N                     499       499       493       493       493
R                               .56       .60       .79       .79

Notes:
For Cohort 3, the application and program assignment took place between July 1984 and May 1985.
Column (3): Regression with only applicant group variables.
Column (4): Conventional independent variables: race, state location, education, marital status, age, number of children, worked for pay before, highest wage attained. (See Appendix for full definitions and coefficients.)
Column (5): Includes all independent variables in column (4) plus 1984 total earnings.
Column (6): Includes all independent variables in column (5) plus subjective ranking.
Controls: Intercept term (and standard errors in parentheses) from regression (3), with Controls as omitted dummy variable.

TABLE 5
Regression Results for Cohort 3 for 1986, a Year of Subsidized Employment for Many Participants
(Dependent variable = total earnings, 1986; mean = $2,175; standard errors in parentheses)

(1)                   (2)       (3)       (4)       (5)       (6)
Controls              $2,473    --        --        --        --
                      (110)
Participants          3,838     1,365     1,283     1,158     1,161
                                (162)     (160)     (132)     (132)
No Shows              2,295     -178      -155      -126      -125
                                (310)     (307)     (253)     (253)
Screen Outs           1,929     -544      -619      -550      -471
                                (191)     (219)     (181)     (191)
Dropouts/NEC          2,139     -334      -517      -620      -555
                                (192)     (219)     (180)     (187)
1984 Total Earnings                                 .99       .99
                                                    (.07)     (.07)
Subjective Rank                                               -88
                                                              (69)
N                     499       499       493       493       493
R                               .22       .29       .52       .52

Notes:
For Cohort 3, the application and program assignment took place between July 1984 and May 1985.
Column (3): Regression with only applicant group variables.
Column (4): Conventional independent variables: race, state location, education, marital status, age, number of children, worked for pay before, highest wage attained. (See Appendix for full definitions and coefficients.)
Column (5): Includes all independent variables in column (4) plus 1984 total earnings.
Column (6): Includes all independent variables in column (5) plus subjective ranking.
Controls: Intercept term (and standard errors in parentheses) from regression (3), with Controls as omitted dummy variable.

TABLE 6
Regression Results for Cohort 3 for 1987, the First Postprogram Year for All Participants
(Dependent variable = total earnings, 1987; mean = $3,261; standard errors in parentheses)

(1)                   (2)       (3)       (4)       (5)       (6)
Controls              $3,139    --        --        --        --
                      (127)
Participants          3,935     796       641       572       578
                                (187)     (178)     (161)     (161)
No Shows              2,951     -188      -267      -236      -235
                                (358)     (341)     (308)     (307)
Screen Outs           2,627     -512      -582      -503      -374
                                (222)     (245)     (221)     (232)
Dropouts/NEC          3,046     -93       -351      -435      -327
                                (221)     (243)     (219)     (228)
1984 Total Earnings                                 .96       .96
                                                    (.09)     (.09)
Subjective Rank                                               -147
                                                              (84)
N                     495       495       489       489       489
R                               .07       .21       .36       .36

Notes:
For Cohort 3, the application and program assignment took place between July 1984 and May 1985.
Column (3): Regression with only applicant group variables.
Column (4): Conventional independent variables: race, state location, education, marital status, age, number of children, worked for pay before, highest wage attained. (See Appendix for full definitions and coefficients.)
Column (5): Includes all independent variables in column (4) plus 1984 total earnings.
Column (6): Includes all independent variables in column (5) plus subjective ranking.
Controls: Intercept term (and standard error in parentheses) from regression (3), with Controls as omitted dummy variable.

TABLE 7
Regression Results for Cohort 3 for 1988, the Second Postprogram Year for All Participants
(Dependent variable = total earnings, 1988; mean = $3,989; standard errors in parentheses)

(1)                   (2)       (3)       (4)       (5)       (6)
Controls              $3,896    --        --        --        --
                      (148)
Participants          4,626     730       597       530       540
                                (217)     (210)     (194)     (192)
No Shows              3,683     -213      -203      -172      -169
                                (416)     (403)     (371)     (368)
Screen Outs           3,376     -520      -646      -560      -300
                                (256)     (287)     (265)     (277)
Dropouts/NEC          3,765     -131      -373      -462      -248
                                (257)     (287)     (264)     (272)
1984 Total Earnings                                 1.02      1.02
                                                    (.11)     (.11)
Subjective Rank                                               -292
                                                              (100)
N                     498       498       492       492       492
R                               .05       .17       .29       .31

Notes:
For Cohort 3, the application and program assignment took place between July 1984 and May 1985.
Column (3): Regression with only applicant group variables.
Column (4): Conventional independent variables: race, state location, education, marital status, age, number of children, worked for pay before, highest wage attained. (See Appendix for full definitions and coefficients.)
Column (5): Includes all independent variables in column (4) plus 1984 total earnings.
Column (6): Includes all independent variables in column (5) plus subjective ranking.
Controls: Intercept term (and standard error in parentheses) from regression (3), with Controls as omitted dummy variable.

Earnings in 1984 are, indeed, a powerful predictor with these grouped data, even four years later, in 1988. The regression coefficient in Tables 4 to 7 shows that for each additional one dollar of earnings in 1984, the current year’s earnings are expected to be one dollar higher, and the t-ratio for the coefficient is about 10, which is remarkably high. Evidently, the use of group means as the observational unit is responsible for the strong and stable relationship between past and current earnings.
No such stability between a prior year’s earnings and a later year’s earnings would be found in a regression with individual persons as the units of observation.

Finally, column (6) shows the coefficients of the dummy variables for the applicant groups in a regression that adds the subjective ranking, which is not usually available and which offers a unique opportunity for modeling the selection process among applicants to the various groups. Given the presence of the other explanatory variables, subjective ranking has a large and statistically significant effect (in the expected direction) on 1987 and 1988 earnings, and smaller and only marginally significant effects on earnings in 1985 and 1986. (It has essentially a zero effect on 1984 earnings.) Column (6) is critical for testing whether any of the nonexperimental comparison groups is notably improved as an alternative to the experimental Controls when the subjective ranking is "held constant." We see below that this improvement occurs, notably for the Screen Outs, for whom the subjective ranking is particularly relevant.

Consider Tables 6 and 7, which cover the two years, 1987 and 1988, that are entirely posttraining years. Among the three applicant groups that are alternatives to the Control group, the No Shows (who were selected to participate in the training program) have earnings that are most similar to the Controls. The earnings of the No Shows are moderately less than those for the Controls, after including all available independent variables in the regression: $235 less in 1987 and $169 less in 1988. Neither difference is statistically significant, but the percentage differences may be of practical importance: 7 percent less in 1987 and 4 percent less in 1988, particularly if differences of even this small amount were to persist over many years.
The earnings of the No Shows are also more similar to those of the Controls for each of the other years, 1984 to 1986, than are the earnings of the Screen Outs or Drop Outs. As noted above, however, we are reluctant to use the earnings similarity between the No Shows and Controls to claim that No Shows provide a nonexperimental comparison group that would be generally valid, because the theoretical basis for the similarity is not firm, and the results we find may be peculiar to our data set. It is true that the No Shows are like the Control and Participant groups in having been selected into the program, but we have no specific information on their reasons for dropping out. The subjective ranking variable and 1984 earnings (for the preprogram year) are virtually the same, so there is little scope for modeling the selection process in the no-show decision.

Our attempt to model the selection process for the Screen Outs shows mixed results in terms of securing a nonexperimental comparison group, although on balance we believe that the results are quite promising. Although the (negative) earnings differences between Screen Outs and Controls are still of practical significance by 1987 and 1988 ($374, or 12 percent, less in 1987, and $300, or 8 percent, less in 1988), the use of the subjective ranking variable sharply reduces the earnings differences. Compare the coefficients for the Screen Out dummy variable in column (6), when the subjective ranking is included, with its coefficients in column (5), when the subjective ranking variable is not included. The Control-Screen Out differential is reduced by 26 percent in 1987 (from $503 to $374) and by 46 percent in 1988 (from $560 to $300).
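The percentages quoted for the Screen Outs follow directly from the coefficients in columns (5) and (6) of Tables 6 and 7 and from the Control means. A minimal sketch of that arithmetic, with illustrative names that are not from the paper:

```python
# Absolute Control-Screen Out differentials from Tables 6 and 7:
# (column 5, without subjective ranking; column 6, with subjective ranking).
screen_out_gap = {1987: (503, 374), 1988: (560, 300)}
# Control group mean earnings in each year.
control_mean = {1987: 3139, 1988: 3896}

def percent_reduction(before, after):
    """Percentage by which the differential shrinks."""
    return 100 * (before - after) / before

# Reduction in the differential once subjective ranking is added.
reductions = {
    year: round(percent_reduction(col5, col6))
    for year, (col5, col6) in screen_out_gap.items()
}
# Remaining differential as a share of the Control mean.
shares = {
    year: round(100 * col6 / control_mean[year])
    for year, (_, col6) in screen_out_gap.items()
}
print(reductions)
print(shares)
```

Both sets of rounded figures agree with the percentages stated in the text.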
We suggest that if even this crudely measured subjective variable has this much success in explaining the differences between applicants who are selected to participate in the program from those who are not selected, there is reason to believe that a more thorough measurement and modeling of the selection process would provide a valid comparison group in a nonexperimental design.

The subjective ranking variable has both practical and statistical significance in the 1987 and 1988 regressions. For each one unit of the ranking, from 1 to 2 or from 4 to 5, for example, the mean earnings of a group are predicted to decline by $147 in 1987 (a 5 percent decline relative to the Control group mean) and by $292 in 1988 (a 7 percent decline). As shown in Table 1, the mean subjective ranking for the Controls was 2.3 and that for Screen Outs was 3.1. (Recall that the value of 3 is assigned to missing values of the ranking, which serves to increase slightly the mean for the Controls and to decrease considerably the mean for the Screen Outs.)

The Drop Out comparison group is shown to be quite similar to the Screen Outs. Furthermore, even though only 37 percent of the Drop Outs were assigned a subjective ranking by the program administrators, using this variable also reduced the difference between the Control earnings and the Drop Out earnings by as much in percentage terms as it did for the Screen Outs. Thus, there is also promise in considering this nonexperimental group as a comparison group, particularly if some information on a subjective ranking can be obtained. Based on the data in this study, the Screen Outs and Drop Outs could be combined, if increasing sample size was important.

IV. ESTIMATION RESULTS OF PROGRAM IMPACTS: TABLES 8 AND 9

Table 8 concentrates on the estimates of the earnings effect of the training program when the Screen Outs or No Shows are used as comparison groups, using the observations for all cohorts in addition to those for Cohort 3.
For Cohort 3 the regressions are similar to those in Tables 3-7 except that separate regressions are fit with just Screen Outs and Participants or with just No Shows and Participants. Using separate regressions for each pair of applicant groups in Table 8 allows all the independent variables in the model to affect EARNS additively for just the two applicant groups, whereas in Tables 3-7 the independent variables are additive across all the applicant groups. To the extent that there are interactions between the independent variables and the applicant-group dummy variables, the results from Tables 3-7 and Table 8 (for Cohort 3) will differ. The differences turn out to be minor.

TABLE 8
Program Participation Effect: Earnings Differences between Participants as Compared to Screen Outs and No Shows
(Dependent variable = total earnings; standard errors in parentheses)

| Year (Cohort 3) | Cohort 3: Screen Outs (1) | Cohort 3: No Shows (2) | Year (All Cohorts) | All Cohorts: No Shows (3) |
|---|---|---|---|---|
| 1984: Preprogram year | 270 (158) | 66 (217) | 1984: Mostly in-program year | 1,272 (164) |
| 1985: In-program year | 2,969 (175) | 2,047 (236) | 1985: Mix of in- and postprogram years | 1,763 (210) |
| 1986: Mix of in- and postprogram years | 1,698 (233) | 1,094 (303) | 1986: Mostly postprogram year | 675 (244) |
| 1987: Postprogram year | 1,004 (252) | 859 (345) | 1987: Postprogram year | 716 (255) |
| 1988: Postprogram year | 767 (299) | 626 (403) | 1988: Postprogram year | 613 (285) |
| Sample Sizes | 224-228 | 164-165 | | 411-414 |

Notes:
All coefficients are taken from a regression with all appropriate independent variables. The regressions for Cohort 3 include 1984 total earnings except for the regression for 1984. The regressions for all cohorts exclude 1984 total earnings as an independent variable because 1984 earnings are an outcome of the training program for Cohorts 1 and 2.
Some years had a few observations deleted because of obvious errors in the earnings or an excessive amount of missing values among the independent variables.
Nevertheless, an important conceptual issue is illustrated by Table 8, which is that in a nonexperimental design only the Participants are the policy-relevant group to which comparisons are made. Recall that in the experimental design the Controls are compared with the combined groups of Participants and No Shows. In a nonexperimental design there are no Controls, and the No Shows are considered as a potential comparison group. In Table 8 the figures for Screen Outs and for No Shows express differences relative to the earnings of Participants, where positive values measure an excess or gain of Participants. The results for Cohort 3 in Table 8, as in Tables 3-7, show positive effects of the training program that are very high in 1985, when many of the Participants are in subsidized employment. The earnings gain of the Participants declines over the next three years, but even by 1988 the gain is statistically and practically significant and larger than the gains relative to the Control group, which will be discussed in Table 9. There are data for Participants and No Shows for all three cohorts, and the EARNS regressions for these data are shown in column (3) of Table 8, using the regression with all the independent variables except for 1984 earnings. Recall that 1984 earnings for Participants in Cohort 1, and to a lesser extent in Cohort 2, are partly attributable to the subsidized employment component of the training program. Thus, the earnings advantage of the Participants in 1984, $1,272 in column (3) of Table 8, reflects the subsidized earnings and explains why the Participant advantage in (mainly) preprogram earnings for 1984 is so much less for Cohort 3. The earnings gains of the Participants for 1985 to 1988 are also less than those for Cohort 3, shown in column (2), but the year-by-year decline in the gain and the values for 1987 and 1988 are not sharply different between columns (2) and (3). 
TABLE 9
Experimental Effects: Earnings Differences between Controls and Combined Participants and No Shows (P+NS), Cohort 3 and All Cohorts, by Year
(Dependent variable = total earnings; standard errors in parentheses)

| Year; Selected Independent Variables | Cohort 3: Coefficients in Simple Regression (a) | Cohort 3: Coefficients in Regression with All Independent Variables (b) | All Cohorts: Coefficients in Simple Regression (a) | All Cohorts: Coefficients in Regression with All Independent Variables (c) |
|---|---|---|---|---|
| 1984: Intercept | 749 | -- | 1,138 (74) | -- |
| 1984: (P+NS) | 122 (92) | 104 (91) | 1,136 (104) | 1,143 (77) |
| 1984: Subjective Rank | | -1 (64) | | -136 (84) |
| 1985: Intercept | 1,495 | -- | 2,271 (77) | -- |
| 1985: (P+NS) | 2,382 (153) | 2,237 (120) | 1,608 (109) | 1,603 (102) |
| 1985: Subjective Rank | | -65 (84) | | -229 (71) |
| 1985: 1984 Total Earnings | | .98 (.08) | | |
| 1986: Intercept | 2,473 | -- | 3,039 (83) | -- |
| 1986: (P+NS) | 1,146 (167) | 948 (137) | 741 (78) | 718 (108) |
| 1986: Subjective Rank | | -49 (97) | | -200 (75) |
| 1986: 1984 Total Earnings | | 1.02 (.09) | | |
| 1987: Intercept | 3,139 | -- | 3,746 (94) | -- |
| 1987: (P+NS) | 655 (185) | 433 (160) | 400 (133) | 345 (119) |
| 1987: Subjective Rank | | -166 (113) | | -251 (83) |
| 1987: 1984 Total Earnings | | 1.05 (.12) | | |
| 1988: Intercept | 3,896 | -- | 4,374 (102) | -- |
| 1988: (P+NS) | 595 (217) | 408 (195) | 524 (144) | 480 (128) |
| 1988: Subjective Rank | | -263 (137) | | -283 (89) |
| 1988: 1984 Total Earnings | | 1.14 (.15) | | |
| Sample sizes | n = 326-331 | | n = 826-829 | |

Notes:
(a) In the simple regression, the intercept = Control mean of total earnings. (No standard error is shown.)
(b) Independent variables include: race, state, education, marital status, age, children, worked before, highest wage before, subjective rank, and 1984 total earnings (except in the 1984 regression). Intercept term is not shown.
(c) Includes all in (b) except 1984 total earnings. Intercept term is not shown.
Some years had a few observations deleted because of obvious errors in the earnings or an excessive amount of missing values among the independent variables.
Table 9 concentrates on the classical experimental results between the Control group and the "treatment" group, where the treatment group is composed of Participants and No Shows, the two groups randomly assigned to the training program. Again, comparisons are shown for Cohort 3 and all cohorts, and again the results for Cohort 3 are similar to those that can be obtained from Tables 3-7. In Table 9 separate regressions are estimated with just the two applicant groups, now treating Participants and No Shows (P+NS) as a single group. Controls are the omitted group in the regressions, so the positive coefficients of the (P+NS) dummy variable reflect the earnings advantage of the (P+NS) group. Results from a simple regression, using only a single dummy variable for (P+NS), are shown along with the multiple regression using all the independent variables, including the subjective ranking variable, S, and 1984 earnings for Cohort 3, when the 1984 earnings can be considered to be (mainly) preprogram earnings. It is interesting that the subjective ranking is consistent in sign and usually statistically significant as a predictor of earnings, even though the variation of S within the screened-in groups is relatively small. The following three findings from Table 9 are especially noteworthy.

Finding 1. The program gains in Table 9 are smaller than the Participant-Control differences reported earlier, which reflects the fact that the No Shows earned less than the Participants in each year. We have no way of knowing whether the earnings advantage of Participants relative to the No Shows is entirely attributable to the gains from training or partially attributable to a positive selection bias for the Participants.
Assuming that the smaller estimated gain for the (P+NS) group is unbiased and assuming that there is no effect of the training program or of the subsequent experiences of the trainees on the No Shows, we can attribute all of the (P+NS) gain to the training program by multiplying it by the reciprocal of the ratio, P/(P+NS). For both Cohort 3 and all cohorts, the Participants were approximately 85.5 percent of the total, so the multiplier is 1.17. Thus, all estimated coefficients of the (P+NS) dummy variable in Table 9 could be multiplied by 1.17 to obtain an upwardly adjusted measure of the earnings effect of the training program. Because Table 8 shows a direct comparison between alternative comparison groups and Participants, which, again, would be the only comparison available in a nonexperimental evaluation, the upwardly adjusted estimates of the earnings gains from training from Table 9 may be considered the appropriate comparison to make with Table 8. For example, the 1988 gains from the training program in Table 9 are $480 for all cohorts and $408 for Cohort 3, and multiplied by 1.17 these would be $562 and $477. From Table 8, the gains using No Shows as a comparison group are $613 for all cohorts and $647 for Cohort 3, and the gain using Screen Outs is $767 for Cohort 3. Generally, the upwardly adjusted estimates of the earnings effect of the training program from Table 9 are smaller than the estimates in Table 8. This is consistent with the findings in Tables 3-7, which show the earnings of the Screen Outs and No Shows to be less than the earnings of Controls.

Finding 2. Another result from Table 9, again consistent with the previous tables, is the smaller earnings gain from training in the later periods, 1987 and 1988, when all subsidized employment had ended for Cohort 3, and from 1986 to 1988, when all or most of the subsidized employment had ended for all cohorts.
It is likely that many training programs bring about increased earnings from job placement effects as well as from increases in earnings capacity (or gains in human capital), and we should expect that the job placement effects on earnings will decay with time, in contrast with human capital effects, which should hold fairly steady over time. If this interpretation of program effects is correct, there is a clear need to obtain employment and earnings information for a reasonably long period after the program ends, at least two or three years.

Finding 3. Finally, Table 9 shows modest but noticeably larger estimates of the earnings effect of the training program in the simple regression than in the multiple regression, especially for Cohort 3. This is unexpected if assignments were truly random and the samples are large. Again, this result was somewhat in evidence in Tables 3-7, where there was a rather consistent decline in the coefficient of Participants when comparing the simple regression result (in column 3) with the results in columns 4, 5, and 6, particularly for the years from 1986 to 1988. Several explanations are possible, including sampling variability or other sources of an initial selection that favored the Participants. We conclude that close attention to the selection process is important in any evaluation analysis.
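The upward adjustment described under Finding 1 can be sketched as follows. The sketch takes the Participant share (85.5 percent) and the 1988 coefficients of the (P+NS) dummy variable from Table 9, and rounds the multiplier to two decimals as the text does. The names are illustrative, and the adjustment rests on the assumption stated in the text that the program had no effect on the No Shows.

```python
# Share of Participants in the combined (P+NS) treatment group, from the text.
participant_share = 0.855

# Reciprocal of P/(P+NS), rounded to two decimals as in the text.
multiplier = round(1 / participant_share, 2)

# 1988 coefficients of the (P+NS) dummy in the multiple regressions of Table 9.
p_ns_gain_1988 = {"All cohorts": 480, "Cohort 3": 408}

# Attribute the whole (P+NS) gain to the Participants alone.
adjusted = {
    sample: round(gain * multiplier)
    for sample, gain in p_ns_gain_1988.items()
}
print(multiplier)
print(adjusted)
```

The adjusted values match the $562 and $477 given in the text and are the figures to set against the Participant-based estimates in Table 8.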

Publication date: 1993